lukewhiting
Contributor

This PR disables data stream auto-sharding for LOOKUP indices to prevent them from scaling above 1 replica, which is unsupported by the lookup mappers.

Fixes ES-12330

@lukewhiting lukewhiting added the >bug, :Data Management/Data streams, auto-backport, v9.0.0, v9.1.0, v9.2.0, v8.19.1, and v8.18.5 labels Jul 17, 2025
@lukewhiting lukewhiting removed the request for review from PeteGillinElastic July 17, 2025 10:41
@elasticsearchmachine
Collaborator

Hi @lukewhiting, I've created a changelog YAML for you.

Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR prevents auto-sharding functionality for data streams that use LOOKUP index mode, as lookup mappers don't support scaling beyond 1 replica. The implementation adds an early return in the auto-sharding calculation logic when the index mode is LOOKUP.

  • Adds a check in the auto-sharding service to return NOT_APPLICABLE_RESULT for LOOKUP index mode data streams
  • Includes comprehensive test coverage for both scenarios with and without index statistics
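
The early return described above can be sketched in isolation. This is a minimal, self-contained model rather than the actual Elasticsearch code: the `AutoShardingSketch` class, its `IndexMode` enum, the `Result` holder, the result labels, and the toy `calculate` signature are all simplified stand-ins assumed for illustration. Only the `LOOKUP` check and the `NOT_APPLICABLE_RESULT` name come from the PR itself.

```java
// Minimal model of the LOOKUP guard; every type here is a simplified stand-in.
class AutoShardingSketch {
    // Stand-in for the real IndexMode enum; only LOOKUP matters for this sketch.
    enum IndexMode { STANDARD, LOOKUP }

    // Hypothetical result holder: a label plus the shard count it recommends.
    static final class Result {
        final String type;
        final int targetShards;

        Result(String type, int targetShards) {
            this.type = type;
            this.targetShards = targetShards;
        }
    }

    static final Result NOT_APPLICABLE_RESULT = new Result("NOT_APPLICABLE", 1);

    // Toy calculation (assumption): one shard per whole unit of write load.
    static Result calculate(IndexMode mode, int currentShards, double writeLoad) {
        // The guard from the PR: LOOKUP data streams are never auto-sharded.
        if (mode == IndexMode.LOOKUP) {
            return NOT_APPLICABLE_RESULT;
        }
        int wanted = Math.max(1, (int) Math.ceil(writeLoad));
        if (wanted > currentShards) {
            return new Result("INCREASE_SHARDS", wanted);
        }
        return new Result("NO_CHANGE_REQUIRED", currentShards);
    }

    public static void main(String[] args) {
        System.out.println(calculate(IndexMode.LOOKUP, 1, 10.0).type);
        System.out.println(calculate(IndexMode.STANDARD, 1, 3.2).type);
    }
}
```

Placing the guard ahead of the load calculation means a LOOKUP data stream never receives a scale-up recommendation, however high its write load is.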

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| DataStreamAutoShardingService.java | Adds LOOKUP index mode check to prevent auto-sharding calculation |
| DataStreamAutoShardingServiceTests.java | Adds test cases to verify auto-sharding is disabled for LOOKUP index mode |

Comments suppressed due to low confidence (2)

server/src/test/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingServiceTests.java:1347

  • [nitpick] The test method name could be more descriptive. Consider renaming to 'testCalculateReturnsNotApplicableForLookupIndexModeWithStats' to better distinguish it from the null stats test case.
    public void testCalculateReturnsNotApplicableForLookupIndexMode() {
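
To illustrate the naming suggestion, the two scenarios can be written as separately named checks. This is a standalone sketch in plain Java rather than the Elasticsearch test framework: `LookupNamingSketch`, its `calculate` stand-in, the `check` helper, and the `WithNullStats` method name are hypothetical. Only `testCalculateReturnsNotApplicableForLookupIndexModeWithStats` is taken from the suggestion above.

```java
// Standalone illustration of distinguishing the two LOOKUP test cases by name.
class LookupNamingSketch {
    enum IndexMode { STANDARD, LOOKUP }

    static final String NOT_APPLICABLE_RESULT = "NOT_APPLICABLE";

    // Hypothetical stand-in: writeLoad is null when no index stats are available.
    // In this sketch the mode check comes first; the real ordering may differ.
    static String calculate(IndexMode mode, Double writeLoad) {
        if (mode == IndexMode.LOOKUP) {
            return NOT_APPLICABLE_RESULT;
        }
        return writeLoad == null ? NOT_APPLICABLE_RESULT : "NO_CHANGE_REQUIRED";
    }

    // Scenario 1: index statistics are present.
    static void testCalculateReturnsNotApplicableForLookupIndexModeWithStats() {
        check(NOT_APPLICABLE_RESULT.equals(calculate(IndexMode.LOOKUP, 2.5)));
    }

    // Scenario 2 (hypothetical name): index statistics are missing.
    static void testCalculateReturnsNotApplicableForLookupIndexModeWithNullStats() {
        check(NOT_APPLICABLE_RESULT.equals(calculate(IndexMode.LOOKUP, null)));
    }

    static void check(boolean condition) {
        if (!condition) {
            throw new AssertionError("unexpected auto-sharding result");
        }
    }

    public static void main(String[] args) {
        testCalculateReturnsNotApplicableForLookupIndexModeWithStats();
        testCalculateReturnsNotApplicableForLookupIndexModeWithNullStats();
    }
}
```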

@lukewhiting lukewhiting marked this pull request as ready for review July 25, 2025 09:37
@elasticsearchmachine elasticsearchmachine added the Team:Data Management label Jul 25, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

return NOT_APPLICABLE_RESULT;
}

if (dataStream.getIndexMode() == IndexMode.LOOKUP) {
Contributor

change lgtm!

if you wouldn't mind me asking a few (hopefully not dumb) Qs just to learn, just if you know them off the top of your head, i can investigate anything you might not know

  1. terminology: in the code and slack discussion regarding the ticket, it seems we use the term shard to only mean write shard, but we also have replica/read shards right? just wondering whether, whenever i see the term shard, i should basically assume this means write shard/primary
  2. so LOOKUP is a type of index that only ever has one primary shard, but can it have more than 1 replica/read shard? assuming it can, do we have any replica auto-sharding in place in data streams if the read load gets too heavy?
  3. why does LOOKUP only ever allow one primary shard? it's always possible it has heavy writes (hence the error for scaleup i'm assuming), is it basically just a mode where you're assuming up-front write volume is low and it's -- as in the name -- just a lookup
  4. for this auto-sharding result logic, if we scale up, do we roll-over to a new index with more primaries or split? and if rollover the new index has more primaries right
  5. for the ticket, did the lookup index just completely fail to rollover and kept accumulating data, or it just threw the validation exceptions as this kept being called to scale-up but eventually was rolled over due to size/time?

Member

@szybia We can chat about this in our 1:1 tomorrow if you like.

Contributor

yes please! might take the load off luke...

Member

@dakrone dakrone left a comment

LGTM also

@lukewhiting lukewhiting merged commit ea22dff into elastic:main Jul 29, 2025
33 checks passed
@elasticsearchmachine
Collaborator

💔 Backport failed

Status Branch Result
9.0 Commit could not be cherrypicked due to conflicts
9.1
8.19 Commit could not be cherrypicked due to conflicts
8.18 Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 131429

lukewhiting added a commit to lukewhiting/elasticsearch that referenced this pull request Jul 29, 2025
…131429)

* Prevent auto-sharding for data streams in LOOKUP index mode

* Update docs/changelog/131429.yaml

* Reduce test duplication
lukewhiting added a commit to lukewhiting/elasticsearch that referenced this pull request Jul 29, 2025
…131429)

* Prevent auto-sharding for data streams in LOOKUP index mode

* Update docs/changelog/131429.yaml

* Reduce test duplication

(cherry picked from commit ea22dff)

# Conflicts:
#	server/src/main/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingService.java
#	server/src/test/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingServiceTests.java
@lukewhiting
Contributor Author

💚 All backports created successfully

Status Branch Result
9.0
8.19
8.18

Questions?

Please refer to the Backport tool documentation

lukewhiting added a commit to lukewhiting/elasticsearch that referenced this pull request Jul 29, 2025
…131429)

* Prevent auto-sharding for data streams in LOOKUP index mode

* Update docs/changelog/131429.yaml

* Reduce test duplication

(cherry picked from commit ea22dff)

# Conflicts:
#	server/src/main/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingService.java
#	server/src/test/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingServiceTests.java
elasticsearchmachine pushed a commit that referenced this pull request Jul 29, 2025
…#132073)

* Prevent auto-sharding for data streams in LOOKUP index mode

* Update docs/changelog/131429.yaml

* Reduce test duplication
elasticsearchmachine pushed a commit that referenced this pull request Jul 29, 2025
…#132079)

* Prevent auto-sharding for data streams in LOOKUP index mode

* Update docs/changelog/131429.yaml

* Reduce test duplication

(cherry picked from commit ea22dff)

# Conflicts:
#	server/src/main/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingService.java
#	server/src/test/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingServiceTests.java
elasticsearchmachine pushed a commit that referenced this pull request Jul 29, 2025
…#132082)

* Prevent auto-sharding for data streams in LOOKUP index mode

* Update docs/changelog/131429.yaml

* Reduce test duplication

(cherry picked from commit ea22dff)

# Conflicts:
#	server/src/main/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingService.java
#	server/src/test/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingServiceTests.java
elasticsearchmachine pushed a commit that referenced this pull request Jul 29, 2025
…#132080)

* Prevent auto-sharding for data streams in LOOKUP index mode

* Update docs/changelog/131429.yaml

* Reduce test duplication

(cherry picked from commit ea22dff)

# Conflicts:
#	server/src/main/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingService.java
#	server/src/test/java/org/elasticsearch/action/datastreams/autosharding/DataStreamAutoShardingServiceTests.java
@lukewhiting lukewhiting deleted the ES-12330-prevent-auto-shard-on-lookup-index branch July 29, 2025 14:32

Labels

auto-backport, backport pending, >bug, :Data Management/Data streams, Team:Data Management, v8.18.5, v8.19.1, v9.0.0, v9.1.0, v9.2.0

5 participants